.HLINES← 68 << 49 NUMBER OF LINES/PAGE >>
.WCHARS← 48 << 81 NUMBER OF CHARS/LINE >>
.turn on "↓#"
.
.
.PAGE FRAME HLINES+2 HIGH WCHARS WIDE
.AREA TEXTER LINES 1 TO HLINES CHARS 1 TO WCHARS
.PLACE TEXTER
.begin verbatim
A VIEW OF LISP, FUNCTIONS, OBJECTS, FRAMES, AND CONSTRAINTS:
FUNCTIONAL FLESH, FUNCTIONAL BONES
By
John R. Allen
The LISP Company
POB 487
Redwood Estates, CA. 95044
.end
Note. This paper summarizes material to be presented at the
LISP/object-oriented tutorial at the faire. Many of the technical details
have been omitted and, as a consequence, some of the discussions will
require substantial elaboration and patience on the part of the reader. If
you do persist you will become more acquainted with the following:
Abstract. An overview of computation and deduction leads to a discussion
of the use of function in computing. We discuss the representation of
functional objects in languages, first as passive data structures, then as
active data. This leads to the possibility of a meta-circular
description of a computing agent for the language, in the language itself:
this is a key result of the LISP style of programming language. This
initial material --the "Bones"-- prepares us to understand the "Flesh"
--object-oriented programming, frame-based languages, and constraint-based
systems. We show that Visicalc can be understood as a simple application
of programming with constraints.
. comment <<the bankruptcy notation argument should come here>>
. Notation
. The role of notation in science
. expressivity
. subordination of detail
. economy
. amenability to proof
. The impact of computing on notation
. executability
. representability of algorithms
. The difficulties with programming languages
.;
Functions: mathematics, computation, and deduction
Functional programming? As the name implies, it means "programming with
functions". Of course that is almost a no-content answer; however one
should ask how often one "programs with functions" in "traditional
programming". Unfortunately most languages view functionality as a poor
relation (no pun intended), basing most programming techniques on
operations that relate closely to those found on traditional computing
engines --the von Neumann machines: the result-to-memory instruction, the
implied increment-program-counter, the conditional jump, the dreaded goto.
Unfortunately those constructs are not very clean either mathematically or
practically. Ignoring the mathematics for the moment, these operations are
much too low-level to act as acceptable structuring devices for complex
processes. (Aside: At a more fundamental level, we reject the notion that
the mechanization of all process and data in the flat world of bit-strings
is adequate --it is "universal", but hardly adequate intellectually.) To
compound the misfortune, most high-level languages simply sugar-coat the
base machine, superficially disguising these instructions as: the
assignment statement, sequential execution, the if-then-else statement,
and the dreaded go-to. These languages --what Backus[3] calls the von
Neumann languages-- include Fortran, Algol, Pascal, Basic, and the
FDA-approved ADA, and COBOL. This is the "bottom-up" approach to
programming languages.
Another branch of language design began in a "top-down" fashion, viewing a
programming language primarily as a notation for expressing ideas and
secondarily as a computational engine. [A sighed: Unfortunately, most of
these efforts didn't consider that a "computational" notation should be
easy for a machine to manipulate.] These languages include APL, LISP,
logic programming, data flow, Smalltalk, and most recently the FP-related
languages of Backus. All of these languages share a common thread: they
began as notations, driven by concerns for expressing computational ideas,
with concerns for execution and implementation taking a secondary role.
The interesting --though perhaps not surprising-- result of investigating
these languages is that a major portion of their power and elegance is due
to their explicit or implicit treatment of "function". That is, the
mathematical elegance is inherited by the practical language. It's not
surprising considering that many centuries of study and effort have gone
into refining the concepts of mathematics.
So one returns to the idea of "function" as a basis for a computational
notation, though one must be careful since "function" as an abstraction is
computation-independent. Recall that a function is simply a set of ordered
pairs; an algorithm is a computational rule --a procedure-- calculating
the values of a function. One may have many algorithms for computing the
same function.
What does it mean to compute "as if" one were computing with functions?
What is necessary computationally to describe a functional unit? What
"useful" (when measured against traditional approaches) computing can be
done with "functional" languages? Universality arguments --in the sense
of whether one language can compute something that another language
cannot-- are not considered "useful" here, by the way. Any non-toy
language is universal; we're talking about practical expressibility.
Let's begin by addressing the issue of "functionality" from a
computational point of view. Given a functional equation like:
F(x,y) = 2x + 4y,
high school algebra tells us that F(3,4) is 22. The rules we used for
computing that value can be classified as either Substitution rules or
Simplification rules.
For example, one may substitute 3 for x and 4 for y throughout the
definition of F:
F(3,4) = 2*3 + 4*4, where we have to write an explicit multiplication 2*3,
not 23.
Now we apply simplification rules:
replace 2*3 by its "simpler" form, 6, giving
F(3,4) = 6 + 4*4; simplification now allows us to replace 4*4 by 16.
F(3,4) = 6 + 16; simplification now allows us to replace 6+16 by 22.
Besides these simplification rules, we have implicitly applied rules for
equality; for example, that if α=β and β=∂, then α=∂.
It can be shown that computation, as we know it, derives from a collection
of substitution and simplification rules, coupled with the laws for the
equality relation. The lambda calculus --a formal logical system--
investigates the essential properties of "functionality" and its
computational interpretation. This system is expressed as a collection of
axioms and rules much like those for elementary geometry --which
formalizes "points" and "lines"--, or symbolic logic --which formalizes
"reasoning"--, only here the informal notion that is formalized is
"function". The informal notion of "computation" is related to (but not
identical to) the notion of proof in this system. That is, if one can
exhibit a proof for a statement using the rules of the calculus, then one
can interpret the proof steps as a "computation" of that statement.
For example, given F(x,y) = 2*x + 4*y then:
.begin verbatim
1. F(3,4) = 2*3 + 4*4 subst. rule
2. 2*3 + 4*4 = 6 + 4*4 simplification
3. F(3,4) = 6 + 4*4 equality rule
4. 6 + 4*4 = 6 + 16 simplification
5. F(3,4) = 6 + 16 equality rule
6. 6 + 16 = 22 simplification
7. F(3,4) = 22 equality rule
.end
So we have "proved" that F(3,4) = 22. Of course, we have used these rules
informally here, but they may be made precise.
Much like functionality abstracts to the idea of a set of ordered pairs
without requiring a computational rule, so too, axiomatic systems do not
specify an order in which rules are to be applied. As long as a rule is
applicable, it may be applied.
For example, instead of Step 2 above, we could have applied
2'. 2*3 + 4*4 = 2*3 + 16 simplification
and continued the revised proof to the same result.
Two of the traditional strategies for "function evaluation" are easily
expressed in this notion of applying rules: Call-by-value corresponds to
the strategy of always simplifying arguments before applying a
substitution rule
Given F(x,y) = 2*x + 4*y then prove F(3,3*2) = 30:
.begin verbatim
1. 3*2 = 6 simplification
2. F(3,3*2) = F(3,6) equality rule
... rest of proof ...
.end
Call-by-name corresponds to the strategy of substitution before
simplification.
.begin verbatim
1. F(3,3*2) = 2*3 + 4*(3*2) substitution
2. 2*3 + 4*(3*2) = 6 + 4*(3*2) simplification
3. F(3,3*2) = 6 + 4*(3*2) equality rule
... rest of proof ...
.end
In many cases, the order of rule application is immaterial; the same
"computation" will result. However, in some important cases, the order of
calculation (or evaluation) does make a difference. Assume we have an
expression --call it "delta"-- in our language whose calculation fails to
terminate: an endless loop, for example. Consider the function G(x,y) =
2*y; i.e. a function that ignores its first argument and doubles its
second.
Now compute G(delta,3). If we simplify (compute, calculate, or evaluate)
the arguments first --call-by-value-- the computation of G(delta,3) will
not terminate because the computation of delta fails to terminate;
however, if we substitute before attempting to simplify --call-by-name--
then G(delta,3) will result in 2*3, which simplifies to 6. An important
source of "delta"-like expressions in ordinary computing languages
involves the conditional expression: "if p is true then a else b", which
we shall write as if(p,a,b). This construct is defined such that one
first evaluates p and, if that value is true, we evaluate a; if the value
of p is false, we evaluate b; we never evaluate both a and b.
For example, F(x) = if(x=0, 1, x*F(x-1)) is a way of representing the
ubiquitous factorial function in this notation. (We can construct a
"delta" expression with F(-1), for example: F(-1) = -1*F(-2) =
-1*-2*F(-3) = ..., never reaching the base case x=0.)
Almost every contemporary programming language contains such a conditional
construct and programmers deal with them without noticeable discomfort,
yet their introduction complicates the formal system. What should we add
to our system to simplify expressions involving "if"-expressions? Here are
some candidates:
if(true, a, b) simplifies (or reduces to) a.
if(false, a, b) reduces to b.
These seem natural, but what should one do with "if(p, a, a)"? Should we
add a simplification rule: if(p, a, a) = a? For p can only compute a true
value or a false value, and in either case we get a. We perhaps can
dispense with the computation of p. But what if p is a dreaded delta
expression and therefore doesn't terminate? Since we are supposed to
compute the value of p first and, in this case, that computation will not
terminate, then the if-construct will not terminate rather than giving the
value a.
We are now moving from the dry land of functionality, into the swamp of
computation: the order in which operations are carried out can affect the
result --a non-functional, process-oriented idea. Unfortunately, or
perhaps fortunately, computation is not trivial to describe and delimit.
However, one can make a general statement about the phenomenon of
computation:
In the most general setting one can view computation as the
application of well defined rules --deduction--, plus a strategy for the
application of those rules --control information.
This "control information" is in the class of "heuristics" or strategies
that one may use to guide the application of the rules of the formal
system. In our computational domain, this control information includes
such strategies as call-by-value and call-by-name.
The "deductive system" is a mathematical object --a formal system obeying
rules of logic. One attractive view of computation says that one should
view the appropriate target for problem specification as the formal,
deductive system; that is, the programming problem is one of specifying
the logical system --axioms and rules of inference-- that characterize the
phenomenon. The role of the control information is that of a guidance
system that will propel the deduction down one of perhaps many paths
toward a solution. One effect of this approach is to split the
specification of a program into two segments: (1) a logic component that
describes the static interrelationships between the objects (these objects
encapsulate our knowledge about both the active --program-- and the
passive --data-- parts of the endeavor), and (2), a control component that
describes computational strategies. One can view this as a synthesis of
the deductive/procedural view of problem solving. This view of the
programming process is most widely accessible in the school called "Logic
Programming" and is embodied in a programming language known as Prolog.
Kowalski[12] has encapsulated this view in the phrase "Algorithm = Logic +
Control".
It is useful to compare what might be called the Pascal school, which can
be characterized by the motto: "Algorithm = Program + Data"; that is, the
factoring of an algorithm into an active and a passive component. The
preceding argument casts doubt on the validity of the split from a
theoretical perspective: though one might believe that the control issue
is solely one of Program, not Data, there are components of Logic in both
Program and Data. However, from an engineering point of view, the Pascal
approach is at least seductive; alas, as we will see in a few moments, the
distance between active (program) and passive (data) is not at all
clear-cut even practically speaking; we will show good reason why one
should view many (if not all) contemporary data structures as active
objects rather than passive objects. Finally, as a banner for a
substantial programming methodology the bifurcation of computation into
mathematics (logic) and process (control) has a much more elegant ring. [A
snide: even when viewing the programming process as an engineering
activity we have difficulty with the Pascal "discipline"; development of
complex programs is a creative, exploratory process, more artistic than
mechanical. As Alan Perlis says "Everything should be designed top-down
--except the first time." A lot of "first time" programming --like AI
programming-- gets done in LISP].
.comment
. Computation and interaction
. The relationship between language and its medium
. why computing languages cannot be separated
. from the programming env
. cf. conversation and understanding
. The polarization between interaction and discipline
. pascal vs. lisp vs. ucsd pascal vs smalltalk
. The polarization between rigor and hacking
. basic vs. (lisp/scheme)
.;
Functions as Data Objects
Let this esoteric business of "what is a function" settle for awhile (we
will return to it) and let's see what one can do with functions as "data
objects". We see "data objects" in all languages; Fortran, Pascal, and
LISP have numbers as data objects. The meaning is clear: we can compute
with them as values. What does it mean to compute with functions as
objects? Well, just as in the Fortran-Pascal case, where one has
functions that operate on data (numbers), we should have functions that
operate on data (now functions). Functions that operate on functions are
called "functionals".
How to view "function" as a "data object"? One can do this in two ways.
First, one can consider functions as completed entities --sets of ordered
pairs. Of course, most interesting functions are infinite sets, so one
either deals with finite functions or deals with finite subsets of the
function.
I first saw this application of functionality exploited in computing with
the Culler-Fried system in the early 1960's. The data items in this
language were finite segments of functions. One performed computations by
applying operators to these functions. These operators were invoked in a
display-based interactive environment, using single key-strokes.
For example, to compute the parabolic function y = 2*x↑2 one would
.begin verbatim
1. Strike the ID key to generate the identity
function on the interval [-1,1] (the range could
be normalized)
2. Strike the STORE key, followed by X.
(steps 1 and 2 could be skipped if X was already
loaded with a function.)
3. Strike the LOAD key followed by X
4. Strike the SQUARE key.
5. Strike the PRODUCT key
6. Strike the 2 key (to generate the constant
function)
7. Strike STORE followed by Y
8. Strike DISPLAY followed by X and Y to view
the result.
.end
Functions could be displayed parametrically, giving complex mathematical
results. Functionals could be constructed and stored under keys; one
could even use the display to generate graphic fonts to associate with the
functionals. The vision and insight in this system was truly remarkable:
highly interactive, functional programming, graphically based; these are
only a few of the features. Remember, this system was operational in 1962
when most of the world was still punching cards!
The second view of "functional object" involves an implicit representation
--a formula-- to indicate a potentially infinite set. For example the
pair of equations:
F(0) = 1
F(x+1) = (x+1)*F(x) represents the infinite set of pairs that expresses the
factorial function. Now when we come to represent this functional
quantity in a traditional computer, we typically give this functional
equation a "process" or computational interpretation and express it in the
more usual algorithmic notation like:
F(x) = if(x=0, 1, x*F(x-1)). One can then view this equation as either a
functional or algorithmic specification. In the general case, there can be
a difference in the two interpretations --in particular when the
algorithmic notation (programming language) allows operations that have
ill-constrained side-effects. However, our initial discussions will only
involve languages for which the functional and algorithmic interpretations
of such equations are the same and we will therefore refer to the
representations of these equations as simply "functional objects".
In a trivial sense, then, almost any contemporary language manipulates
"functional objects": a compiler, for example, will take a data structure
representation of a "function" and manipulate it, producing executable
code. Unfortunately, in most languages, the representation given to
functions is unnecessarily weak. Some languages require such objects to be
encoded as numbers; others represent functional quantities as strings;
some allow representation as character arrays. But semantically there is
more structure to a functional representation than any of these views.
Indeed, that should be apparent even at the level of compiler writing: the
first duty of a compiler is to "parse" the external representation into an
internal representation that the analysis and synthesis sections can
manipulate easily. This internal form is always some type of tree
structure.
It is most unfortunate that many languages don't supply a representation
for such tree-like entities. Even fewer supply an effective, pleasing
human-oriented notation for this representation (try writing a record
structure representation of a parse tree for the factorial function!).
Finally, the class of languages that allow these representations to be
executed (! god (of your choice) forbid!!) as well as manipulated is
even smaller.
Reflect on the following: Given that compilers, and language translators
in general, parse programs from an external syntax into a tree-like
representation before translating them, and given that the internal
representation is executable as it stands, why not dispense with that
external syntax entirely and program directly with the internal
representation? The answer is that the internal representation is usually
not human-oriented. But let's assume we have a notation for that
representation that is simple and pleasing; why not then?
Indeed, why not. This freedom and flexibility is present in those
languages derived from LISP, and in those languages we do program in this
style; so let's take a quick tour of a LISP dialect.
LISP
The key --we will see what it unlocks in a moment-- is to think about an
acceptable, robust representation for "functions" --active units in a
language-- such that we can write programs that manipulate these
representations. As we said above, for easy manipulation of functions we
wish a representation that embodies the structural interrelationships
between the components, and we wish that representation to have a pleasing
notation (incantations to bit patterns or Godel numbers are not
sufficient). We will ignore the third criterion --executability-- for the
moment. So our first quest is "functional data"; "functional programming"
is an orthogonal issue.
So our major (novel, but not only) data structure will be a tree structure.
At the tips of the tree will be found atoms --numbers or character
sequences like ABC or A&5 or IS-MUMBLE. The all-seeing arborist has
noticed that trees that only associate atomic quantities with the leaves
are sufficient to represent any kind of tree-structure (actually we can
simplify even more):
.begin verbatim
                 |
     ____________|_____________
     |           |         |  |
    IF        ___|___      1  2
              |  |  |
              EQ X  0
.end
Now to the question of pleasing notation. In the long term we may have
graphical languages that will allow us to manipulate these tree-like
objects directly, but for now we must re-express these structures in a
linear, sequential notation of symbols; and of course, that notation must
exhibit the structuring present in the graph.
Fortunately, there is an elegant notation (and theory) already ensconced
in mathematics: the finite sequence. A finite sequence is a finite,
ordered collection of objects; period. Though mathematics frequently
restricts those objects to be integers, we will allow the objects to be
atoms and finite sequences themselves.
For notation we write an n-element sequence of objects o↓i as:
(o↓1 o↓2 ... o↓n) and for example, the above tree could be written as
(IF (EQ X 0) 1 2).
These representations of finite sequences are called lists, and the
notation is called list notation.
Now we are in position to attack the function-representation problem with
the sequence notation. Every language construct must find representation
as a data item. Assume a toy Algol-ish language named A1 (whose data
domain is simply integers).
.BEGIN VERBATIM
Typical language construct
  * Mapping to finite sequence

Begin {<s↓i>;} <s↓n> End
  * (BEGIN <S↓i> ... <S↓n>)

<var> ← <val>
  * (← <VAR> <VAL>)

Function <name>({<formal↓i>}):<type>; {<s↓i>}
  * (DEFINE <NAME> (({<FORMAL↓i>}) <TYPE>) {<S↓i>})

<name>({<p↓i>})
  * (<NAME> {<P↓i>})

While <pred> DO <s>
  * (WHILE <PRED> <S>)

If <pred> then <s1> else <s2>
  * (IF <PRED> <S1> <S2>)

Variables
x
  * X (the upper-case symbol counterpart)

Constants (yes, they need a representation)
numbers
  * numbers (recall numbers are atoms)
true
  * T
false
  * NIL
.END
Notice that the representation of A1 constructs is actually more regular
than the surface syntax of A1: the first atom in the sequence
representation tells us what the remainder of the sequence represents;
thus the "End" and the "buzz words in the "if" and the "While" are
unnecessary. Perhaps more pleasing is that all the difficulty of
semi-colon placement between constructs is unnecessary in
the LISP scheme (LISP stands for: Lacks Inane
Semi-colon Problems).
As an example of the
mapping, we give the Algolish form of the factorial function and a
possible translation onto a sequence representation:
.BEGIN VERBATIM
Function Fact (N:integer): integer;
If N=0 then Fact := 1
else Fact := N*Fact(N-1)
(DEFINE FACT (((N INTEGER)) INTEGER)
        (IF (= N 0)
            (← FACT 1)
            (← FACT (* N (FACT (SUB1 N))))))
.END
Note that we could dispense with the external syntax of A1 and program
with the internal form. Though foreign to users accustomed to algolish
notation, the syntactic representations only differ marginally. It remains
to demonstrate that there are compelling reasons to shift. Those reasons
will become clear in a moment. Note too, that the issue of programming in
this internal form is totally independent of the issue of whether lists
--finite sequences-- are data structures in A1.
So now our data domain looks reasonable; we can represent typical kinds of
program structures in the domain. The question remains whether we can
write useful algorithms to manipulate these items. This issue involves
the construction of a language to manipulate trees or sequences. Many
language bases will do; a Fortran-ish, Algol-ish, or Basic-ish starting
point would suffice; but sufficiency never did lead to elegance or
quality.
The initial language we pick, called L1, will operate on lists in a
calculator-like fashion: we will present it with an expression to evaluate
and it will return values. Since we wish to view "algorithmic functional
objects" as "mathematical functional objects" we will restrict our
operations to those founded on mathematical rigor. We will allow only
functional composition and application as functional
operations. We will only allow conditional expressions as a control
structure; and as a convenience (though not strictly necessary) we will
allow functions to be named. Finally, we will specify primitive
operations on LISP data structures --these operations are of three basic
classes:
Constructors: operations to make new elements
.begin indent 1,1,1
cons(x;y) --takes x, an arbitrary object, and y, a list, and makes a new
list with x on the front.
list(x↓1; ...x↓n) --takes an arbitrary number of arguments and makes a
sequence from them.
.end
Selectors: operations to select components from composite data structures.
.begin indent 1,1,1
first(x) --takes x, a non-empty list, and produces its first element.
rest(x) --takes x, a non-empty list, and produces the list with the first
element removed.
.end
Recognizers: predicates to examine properties of objects.
.begin indent 1,1,1
atom(x) --gives "true" if x is an atomic object; false otherwise.
listp(x) --gives "true" if x is a list.
null(x) --gives "true" if x is an empty list; false otherwise.
eq(x;y) --a version of the equality predicate: is x=y "true"?
.end
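For example, reading the arguments and results in list notation:
.begin verbatim
first((A B C)) = A
rest((A B C)) = (B C)
cons(X; (A B C)) = (X A B C)
null(( )) = true
atom(A) = true
.end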
.comment ****much, much more on abstraction ******;
Both the intellectual task and the programming problem are greatly aided
if one thinks of data structures and their operations in terms of
constructors, selectors, and recognizers. The appropriate use of these
ideas is at the heart of the notion of abstract programming.
For example, in defining a program manipulation system to manipulate
representations of A1's constructs, we could define a recognizer for
conditional expressions, and a constructor for conditional expressions as
follows:
.BEGIN VERBATIM
is-if(exp) = if listp(exp)
                then eq(first(exp); IF)
                else false
make-if(pred; conseq; alter)
        = list(IF; pred; conseq; alter)
.END
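In the same style we can define selectors for conditional expressions (a
sketch, matching the (IF <pred> <s1> <s2>) representation above); the
evaluator written later will assume these under the names PREM, CONSE,
and ALTER:
.begin verbatim
prem(exp) = first(rest(exp))
conse(exp) = first(rest(rest(exp)))
alter(exp) = first(rest(rest(rest(exp))))
.end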
. comment << *****lisp history*****>>
. history of implementations and features they introduced
. 704 (1.5) → 7090 → 7040 gc, funarg, etc
.
. pdp-1 → sds940 → bbn-lisp → interlisp
.
. pdp-6 maclisp (value cell) → 1.6 → uci (1.6+bbn editor) → yale
. → bibop maclisp → lm lisp (mac+ mdl)
.
. → NIL
. → Franz lisp
.
. vlisp(10,11,80)
.
. standard lisp
.
. lisp370
.
. wisconsin/maryland lisp
.
. micro lisps
.comment lisp features
. data objects
. lists
. the traditional lisp data structure. everything is a pointer
. and therefore instead of saying "x is a pointer to a foo",
. we need only say "x is a foo". incredible liberation.
.
. symbols
. aka literal atoms: pointers must stop somewhere, and in lisp these
. symbols are the primitive objects for representation (naming)
.
. numbers
. the traditional primitve object in traditional prog. langs.
. arb. prec.
. an improvement in lisp's arithmetic (c. 1968) numbers as lists
.
. arrays
. the "structured object" in primitive languages
. in lisp, just another tool for data. key: first-class, but
. many lisp's slip up here
.
. strings
. another ....
.
. p-lists
. to some extent lisp's "record structure", but much more flexible
. as basis for simple obj-or. programming. (leads into ...)
.
. functions **
. yes, functions are data items in lisp: very important. source of much
. current controversy.
.
.
. envrionments (as data) NB this is not what is meant by "prog env"
. interesting innovation based on soln to function as data: the symbol
. table defining the environment is available as manipulatable
. data. it is a tree-struture (non-stack) therefore in 1.5 defn
. was a gc-ible commodity.
.
. control (as data)
. differs from above: run-time computational history is open to
. investigation and manipulation. cf. debugging
. this is a freer interpretation than most lisps allow
.
.
.
. hunks (?)
. ***firts-classness here
.;
.comment language features
.language constructs
. syntactic issues: prog as data
. progs are rep as list structure.
. why and when
. jmc and turing machine
. why it is good
. ediiting,debugging, compiling, ... i.e. prog manipuation
. side-step uninteresting syntax questions. semanitcs lang.
.
. control: modulo syntax, traditonal
. do
. if
. <=
. function
. catch/throw
. macros
. read macros
.;
.comment details of lisp prog env.
. prog env.
. interactive language
. read-eval-print of machine
. windows ala smalltalk and LM
.;
. ----------
. lisp-ish languages and features
. planner pdi and related benefits (see prolog)
. conniver context layers (see PIE), a "machine" (see Scheme chip)
. muddle &-syntax, algolish syntax, multiprocessing
. ecl more algolish, extensibility, data type delcarations
. scheme lexical scoping. "chip-able"
. smalltalk(?) object-oriented, classes, windows in env.
.;
Now let's stand back and examine our position.
1. First, we took an Algol-ish language, A1, and mapped its constructs
into the world of lists.
2. Then we noted that we could in fact view the internal form of A1
expressions as executable objects themselves, dispensing with the external
syntax and the associated obnoxious parsing problems.
3. Finally, we concocted a LISP-ish language, L1, and showed how it could
manipulate representations of elements of A1.
Functional Bones
Occam's razor says one language is better than two. Let's pick L1 as the
survivor. Now our mapping process says that we can represent
program elements of L1 as data items of L1 and throw away the surface
syntax of L1! All programming is done in the internal notation --the data
domain--, all data items come out of the same data domain (A "strange
loop" is on the horizon --for those who have read "Godel, Escher, Bach").
All we need do is describe a map of the language constructs onto the data
domain of lists. Actually the existing map of A1 into lists is
quite appropriate; we need only augment it to handle the constructs that
appear in L1 but not in A1; we will also flush a lot of the A1
constructs: assignment, while, and begin-end are all unnecessary.
The only two constructs that warrant additional discussion are the
representation of list-constants and the representation of functional
objects. Since program elements are now going to be mixed up in the same
domain as data objects, we need a mechanism to distinguish between the USE
of an object as a program construct and the MENTION of it as a data item.
English has the same problem: Chicago is a city --"Chicago" is a seven
letter word. We use a similar quote convention in the mapping, writing
'(PLUS 1 2) [or more precisely (QUOTE (PLUS 1 2))] to mention the list
(PLUS 1 2) in a computation, and write (PLUS 1 2) to apply (use) the PLUS
function in a computation.
The second candidate for mapping --the functional construct-- will be new.
In A1 we always had a functional object associated with a name; in L1 we
want to talk about functional objects independent of any name. The
necessary constituents of the functional object are: its formal
parameters, and its body. We also need a "reserved" word/symbol to say
"this object is a function"; that word is λ.
Our functional object in L1 is therefore:
λ(({<formal>↓i}) {<exp>↓i}) and its translation into L1 data structures
is:
(LAMBDA ({<FORMAL>↓i}) {<EXP>↓i}). Thus for example we could express the
"squaring" function as:
λ((x)#(times#x#x)) independent of any name, and would write its data
structure representation as:
(LAMBDA (X) (TIMES X X)).
Given the novelty of functional objects as data structures, one should be
able to demonstrate significant applications if this novelty has
substance. It might be that it's another out-dated hold-over from the von
Neumann era of the "stored program machine". These machines allow the
storage of programs (bit strings) in the same memory space as that used by
data objects (bit strings); it is not the contents of a memory cell that
determines whether it is program or data, but rather that determination is
a result of how the central processor accesses the memory cell. [By the
way, the major hallmark of von Neumann machines is, to me, the "stored
program concept" not the particular kind of dreadful language that we
continue to stuff into them.]
Note that we are on the same path here, representing programs as data
objects. The difference is that our primitive components are more
structured than the simple bit-string. We reduce program and data to
tree-structured quantities. To complete the process, we need to describe
a processing unit that will manipulate the data and execute the programs
(but program execution is only a special case of data manipulation).
.begin verbatim
(DE EVAL (X ENV)                       ; evaluate expression X in ENV
 (COND ((IS-CONST X) (VAL X))
       ((IS-VAR X) (LOOKUP X ENV))
       ((IS-IF X) (IF (EVAL (PREM X) ENV)
                      (EVAL (CONSE X) ENV)
                      (EVAL (ALTER X) ENV)))
       (T (APPLY (EVAL (FUN X) ENV)
                 (EVLIS (ARGS X) ENV)
                 ENV))))

(DE APPLY (FN EARGS ENV)               ; apply FN to evaluated arguments
 (COND ((IS-PRIM FN) (DOIT FN EARGS))
       ((IS-LAMBDA FN)
        (EVAL (BODY FN)
              (MK-ENV (FORMALS FN)     ; bind formals to EARGS
                      EARGS
                      ENV)))))

(DE EVLIS (L ENV)                      ; evaluate a list of arguments
 (IF (NULL L)
     ( )
     (CONCAT (EVAL (FIRST L) ENV)
             (EVLIS (REST L) ENV))))

(DE LOOKUP (N ST)                      ; find N's value in symbol table ST
 (COND ((NULL ST) error)
       (T (LOOKUP1 N
                   (NAMES (FIRST ST))
                   (VALUES (FIRST ST))
                   ST))))

(DE LOOKUP1 (N NAMES VALUES ST)
 (COND ((MT-Y NAMES) (LOOKUP N (REST ST)))
       ((EQ N (GET-N NAMES)) (GET-V VALUES))
       (T (LOOKUP1 N
                   (TAIL NAMES)
                   (TAIL VALUES)
                   ST))))
.END
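For example, assuming the atom PLUS evaluates to a primitive (one that
IS-PRIM recognizes and DOIT knows how to execute), and that MK-ENV builds
a symbol-table frame from a list of names and a parallel list of values,
the call
.begin verbatim
(EVAL '(PLUS X 3) (MK-ENV '(X) '(4) ( )))
.end
looks up X, evaluates the argument list with EVLIS, and hands the
primitive to DOIT, yielding 7.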
Notice that we have written an "abstract" version of the evaluator; no
details of the representation have sullied our definition. Most of the
business should be fathomable with a bit of persistence. What may resist
your prodding is the business of parameter passing and its related
behavior of variable evaluation.
When a functional object is recognized --a (LAMBDA ...) expression-- in
APPLY, we bind the evaluated actual parameters to the formal parameters
and evaluate the body. During the evaluation of that body we may need the
current value of a formal parameter; that process is handled by LOOKUP.
That function looks through the lists of bindings.
Summary of Passive Functional Objects
The simplicity of the functional language --a LISP subset-- was used to
discuss the novelty of allowing functions as passive data objects; this
balanced a simple, almost trivial language against a novel and elegant
idea (traditional and most other functional formalisms do not allow
functions as data items).
The EVAL/APPLY pair shows an elegant, important, and useful application of
the representation: one can succinctly describe the evaluation
(implementation) of the language IN THE LANGUAGE ITSELF. It's more than
"cute"; it's a fundamental result in computer science.
Now we have a "first-order" understanding of functionality, where by
"first-order", we mean that though functional objects are available as
data structures, we have not really exploited that fact. All functions
that we have written so far have only passed data items as parameters and
returned data items as values; we have not explored the possibility of
allowing functions as arguments to other functions or functions that
produce functional results.
Active Functional Objects
Notice that, though
EVAL and APPLY operate on functional objects, they only do so in a passive
sense: they examine their structure and simulate their execution; they do not
execute them.
To activate a functional object we have to apply it; that is, it has to
appear in the function position of an expression.
Let's take an example from mathematics. Assume we have a finite
sequence of integers (i↓1,#i↓2,#...i↓n), and assume we want to define a function
that will add one to each element. We could do that with:
.begin verbatim
(DE ADD-SEQ (L)
(IF (NULL L)
( )
(CONS (ADD1 (FIRST L))
(ADD-SEQ (REST L)))))
.end
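Thus (ADD-SEQ '(1 2 3)) yields (2 3 4).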
Of course, we may want to perform other functional operations on such
sequences, and writing a separate function each time is tedious. So we
generalize:
.begin verbatim
(DE MAPPER (FN L)
(IF (NULL L)
( )
(CONS (FN (FIRST L))
(MAPPER FN (REST L)))))
.end
and (ADD-SEQ L) is equivalent to (MAPPER ADD1 L).
Notice now that FN is passed in as an argument (data)
and used within MAPPER as a function (active); that's new.
Further, let's assume we wish to generalize to be able to add an
arbitrary amount to each element of the sequence. First we need a
function that manufactures incrementing functions; let's make one:
.begin verbatim
(DE INC (N)
 (LAMBDA (X) (PLUS N X)))
.end
INC takes an integer parameter and creates a new unary function
that, when it's applied, will increment ITS argument by N.
That's new: a function that creates functions! The critical
point in implementing such behavior is to make sure that the value
of N is effectively captured with the functional value; this is an instance
of the famous "funarg problem".
Now we can define our generalized incrementer:
.begin verbatim
(DE INC-SEQ (N L) (MAPPER (INC N) L))
.end
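For example, (INC-SEQ 5 '(1 2 3)) should yield (6 7 8): (INC 5)
manufactures a five-incrementing function and MAPPER applies it down the
list.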
Now we have functions that take functions as arguments and functions that
return functions as values, and things get more interesting in the evaluator:
to properly represent
functional objects for evaluation, the new EVAL/APPLY must capture the
current state of the symbol table when a functional object is created,
and must use that captured table when it is ready to apply the functional
object to its arguments. The reason for this added complexity involves the
question of "scoping rules". The problem and its original solution was
discovered by the LISP community in the early 1960's. They showed that
functionality is captured only by including the "free variables"
(like N in INC) of a
function as part-and-parcel of the function. That is, a function is the
text plus the environment.
.begin verbatim
(DE EVAL (X ENV)
 (COND ((IS-CONST X) (VAL X))
       ((IS-VAR X) (LOOKUP X ENV))
       ((IS-IF X) (IF (EVAL (PREM X) ENV)
                      (EVAL (CONSE X) ENV)
                      (EVAL (ALTER X) ENV)))
       ((IS-LAM X) (MK-FUNARG X ENV))  ; capture the current environment
       (T (APPLY (EVAL (FUN X) ENV)
                 (EVLIS (ARGS X) ENV)
                 ENV))))

(DE APPLY (FN EARGS ENV)
 (COND ((IS-PRIM FN) (DOIT FN EARGS))
       ((IS-FUNARG FN)
        (EVAL (BODY FN)
              (MK-ENV (FORMALS FN)
                      EARGS
                      (ENVIR FN))))))  ; use the captured environment
.end
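To make the sketch concrete, here is one possible representation for the
FUNARG objects (an assumption of this sketch; any representation that
carries the lambda text together with the captured environment will do,
and single-expression bodies are assumed):
.begin verbatim
(DE MK-FUNARG (LAM ENV) (LIST 'FUNARG LAM ENV))
(DE IS-FUNARG (FN)
 (IF (ATOM FN) NIL (EQ (FIRST FN) 'FUNARG)))
(DE TEXT (FN) (FIRST (REST FN)))          ; the (LAMBDA ...) text
(DE ENVIR (FN) (FIRST (REST (REST FN))))  ; the captured symbol table
(DE FORMALS (FN) (FIRST (REST (TEXT FN))))
(DE BODY (FN) (FIRST (REST (REST (TEXT FN)))))
.end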
This result is interesting for two reasons; first it solves a problem that
plagued(s) traditional languages that attempted to implement functional
objects (note: a language (e.g. Algol) can have functional objects without
having a data structure representation for them --Occam's razor says it's a
bad idea though). For example, both Algol and Pascal put serious
restrictions on the use of "procedure-valued" variables. These
restrictions are based on implementation considerations and violate a
principle called "first-class" treatment of data structures. This
principle says that whenever a data structure is to be introduced into a
language, it must be done in such a way that any instance of that
structure will be totally unrestricted in how it may participate in a
computation; instances must (at least) be able to be passed as parameters
and returned as values. For example, your favorite language probably
allows integers as first-class data (but I'll bet it has a restriction on
the size of the integer!); but does it handle arrays and records in a
civilized way? Can you return one of these objects as a value? Probably
not. Can you pass procedures as parameters to other procedures? Return
procedure values?
.comment
. discussion of dynamic, static, packages, levels
.;
More generally however, this idea of "capturing state" with a functional
object extends beyond the "free variable" problem; it gives us an elegant
expression of the ideas behind the discussions of "object-oriented
programming" and "message passing"; these ideas are applications of
first-class functional objects. This is the subject of the next section.
(and for you real doubting Thomases we will relate all this strange
material to the mechanisms behind Visicalc!)
.comment .Imperative features and non-rec. control
. local assigment to control objects
.iteration
.macros
. extended control
.catch-throw
. control as data --cf. function as data
. basis for intelligent systems?
. circumspection/introspection ---doyle
.;
Property lists
Even in the earliest LISP implementations (1959-1960) it was recognized
that our mild-mannered symbol names could be carriers of useful
information. LISP views symbol names much like words in a dictionary,
associating alternate usages and their corresponding definitions with each
word. LISP generalized this idea and allows one to associate with each
word (or symbol) a collection of "attributes" and "values". For example,
with the symbol BLOCK-1 we might associate its SHAPE, say SQUARE, and its
COLOR, perhaps RED. All these associated properties and values are called
the symbol's "property list", which you may think of as a collection of
pairs: each with one attribute and one value. LISP supplies appropriate
operations to sample, set, and remove properties.
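A sketch of these operations in MacLisp style (PUTPROP takes the symbol,
the value, and then the attribute; dialects differ on names and argument
order):
.begin verbatim
(PUTPROP 'BLOCK-1 'SQUARE 'SHAPE) ; set: BLOCK-1's SHAPE is SQUARE
(PUTPROP 'BLOCK-1 'RED 'COLOR)    ; and its COLOR is RED
(GET 'BLOCK-1 'COLOR)             ; sample: returns RED
(REMPROP 'BLOCK-1 'COLOR)         ; remove the COLOR property
.end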
Functional Flesh
We have now constructed the skeleton of our functional approach to
programming; we are poised to reap the benefits. This section may go by
in a rather whirlwind fashion. If you attend the tutorials, the details
will appear; if you don't attend, then this quick tour should give you
enough perspective to put things together from the bibliographic
references.
To paraphrase Alan Kay, "LISP is the lowest level of assembly language
that any reasonable person would ever desire"; LISP is also a desirable
perspective from which to understand and view the vistas that have
developed in the last twenty years. We begin with an ancient LISP artifact
called the "property list".
Data-Driven Programming
Armed with property-lists, we return to ponder the evaluator; ponder,
ponder. We notice that EVAL has this large conditional expression that
"dispatches" to a piece of code to handle the evaluation of the
appropriate expression; we also notice that this kind of programming does
not lend itself to "extensibility". That is, if we wish to add a new
construct to the language we must add a new segment to EVAL to describe how
to evaluate instances of that construct. That addition would require
rewriting EVAL; it would require modifying the PRINT routine so we could
print these new objects. In general, any existing code that wanted to know
about these new objects would have to be modified. Loss. More ponder,
ponder. Ponderosa! [a tree joke]
Since each object in LISP has a type --constant,
variable, conditional, etc -- let each type be represented in LISP as an
atom, and let's put on the property list of each type, information that
tells us how to operate on elements of that type. So for example, on the
property-list of COND we would put the pair whose attribute was the atom
EVAL and whose "value" was a function to evaluate conditional expressions.
What happens to the EVAL function? It almost disappears.
.begin verbatim
(DE EVAL (EXP ENV)
((GET (GET-TYPE EXP) 'EVAL) EXP ENV))
.end
where (GET-TYPE EXP) returns a type-name and (GET type-name 'EVAL)
retrieves a function to evaluate the instance of type-name. There are
several benefits from this technique --called data-driven programming--;
in particular, it simplifies the extensibility problem. To add a new type
we add a new "type-manager"; we need not modify existing code. The effect
is to localize type-driven computations in the type manager;
communication between types is done through the property-list. This is
real distributed processing.
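For example, here is a sketch of installing the conditional's type-manager
described above (it assumes the property-list operations of the previous
section, that functions are data, and the PREM/CONSE/ALTER selectors):
.begin verbatim
(PUTPROP 'COND
         '(LAMBDA (EXP ENV)
            (IF (EVAL (PREM EXP) ENV)
                (EVAL (CONSE EXP) ENV)
                (EVAL (ALTER EXP) ENV)))
         'EVAL)
.end
Adding a new construct to the language is now just another PUTPROP.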
This technique is the beginning of the trail to object-oriented
programming. Object-oriented means that the object (think "data") has
control of its own destiny. The traditional view could be called
"function"- or "action"-oriented, meaning that the function (in this case
EVAL) is free to ravage its argument. That hardly seems proper in a polite
society; rather one should, it's claimed by the message-passing and
object-oriented schools, politely ask the object for a response to a query
(or message). If the data feels so inclined, a response will be
forthcoming.
Notice that our data-driven EVAL is not object-oriented in this sense:
EVAL, not EXP, has control. A subtle change and the tables are turned:
Object-Oriented Programming
(EVAL EXP) goes to (EXP 'EVAL), meaning "to EVAL the expression EXP, send
the message EVAL to the object EXP". Now EXP is active, and EVAL is only a
message. Of course, nothing is for free; the complexity of EXP has
increased, but the change of perspective can be quite powerful as we shall
see.
What is EXP now? It's a functional object; and it's an instance of the
class to which EXP belongs.
As an example, here's a version of the sequence
primitives in this programming style.
.begin verbatim
(DE FIRST (X) (X 'FIRST))
(DE REST (X) (X 'REST))
(DE CONS (X Y)
(LAMBDA (MSG)
(COND ((EQ MSG 'FIRST) X)
((EQ MSG 'REST) Y))))
.end
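A sketch of these objects in use:
.begin verbatim
(FIRST (CONS 1 (CONS 2 ( ))))        ; sends 'FIRST, returning 1
(FIRST (REST (CONS 1 (CONS 2 ( ))))) ; sends 'REST, then 'FIRST: 2
.end
Each CONS manufactures a little functional object that remembers X and Y
and answers the two messages.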
The idea of active data takes a bit of getting used to, as does the
idea of functions returning functions as values. But these are two of the
key ideas behind the Smalltalk phenomenon.
If you're worried about syntax, do you like the following
Smalltalk-76 version better?
.BEGIN VERBATIM
Class new title: 'CONS';
fields 'first rest'.
Access to fields
first[↑first]
rest[↑rest]
.END
Of course, objects can become much more complex in Smalltalk, but even
then the functional object explanation carries the day. In general, one
can view "message passing" as an application of functional programming
where the target of the message is a constructed functional object that
contains local state information plus a table of messages that the target
can respond to. It's that "simple".
More Property Lists: Hierarchies
We return to the kernel LISP dialect, and mine some additional gold from
it and property lists. Assume we wish symbol names to represent people;
for example, let Ruth be represented by the symbol RUTH. As a person, she
therefore has certain characteristics: height, weight, age, etc. We can
represent them as attribute-value pairs on her property list: AGE:45,
WEIGHT:102, HEIGHT:64, ... If we expect to develop a large personnel data
base, this process can get quite expensive; expensive both in terms of
time to specify all these components --the attributes are shared by all
people, men and women--, and expensive in terms of space --each employee
has a quantity of redundant information. That information is redundant in
the sense that many attributes are inherited (or usually inheritable) from
a class to which the individual belongs. For example, RUTH would inherit
several biological properties because she is in the class WOMAN; she would
also inherit several properties from the super-class of HUMAN. It is
absurd to propagate all the applicable properties of these super-classes
onto the property list of every individual in the base. A more rational
approach is to extend the property-list (data base facility) so that
properties may be accessed by virtue of being inherited from above. Thus
the simple (GET symbol attribute) operation can invoke a search through a
class hierarchy. And clearly before this can happen we must implement a
class hierarchy. One type of class
hierarchy is present in the Smalltalk
family of languages; there are classes and sub-classes, and objects
are instances of these classes. For example, refer to the previous
CONS example; we created cons-cells by instantiating the class, CONS.
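A minimal sketch of such inherited access, assuming each symbol's class is
stored under an (invented) ISA attribute --RUTH's ISA is WOMAN, WOMAN's
ISA is HUMAN, and so on up the hierarchy:
.begin verbatim
(DE GET-INH (SYM ATTR)
 (COND ((NULL SYM) NIL)                    ; ran off the top: no value
       ((GET SYM ATTR))                    ; found locally: use it
       (T (GET-INH (GET SYM 'ISA) ATTR)))) ; else ask the super-class
.end
Thus (GET-INH 'RUTH 'WEIGHT) finds 102 on RUTH herself, while an attribute
shared by all people would be found up on HUMAN.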
Frame Languages
A similar, but not identical, class-like system is also available in the
newer AI languages (written as extensions to LISP) called Frame Languages.
There one finds "slots" or "frames" to hold property-value pairs;
these frames are related in a hierarchical structure.
One further feature of a Frame language (e.g. KRL, FRL, XRL) is the extended
notion of property-list access and manipulation: the access to a data item
(the value of an attribute) may invoke a complex computation. This is
another case of data being a trivial subcase of a procedure invocation
(recall our earlier discomfort with Algorithm = Program + Data?) Another
generalization makes that discomfort more pronounced: our "data base" will
become a "procedure base".
Return to our personnel file; assume there is a property of individuals
that is relatively easy to compute but whose storage requirements are
humongous. Rather than store the value we would rather store the procedure
to compute that value. Then if that value is needed, we exercise the
procedure, compute the value, and free the storage. Frame languages allow
such procedure objects in the "data" base. This particular kind of object
is called an "IF-NEEDED" method, because it is invoked only if and when
the value is actually needed.
Two further kinds of procedures (or methods) find application: the IF-ADDED and
IF-REMOVED methods. These procedure types are called "demons" because
they are never explicitly invoked, but are installed to monitor particular
kinds of activity in the base. For example, if we decided to add a new
employee to the base we should expect several IF-ADDED demons to be
invoked to update data-base components to maintain consistency.
Demons can also be invoked even more implicitly: by pattern. A method of
the form:
(IF-ADDED pattern body)
is a type of demon that will be invoked whenever
an object is added to the base and that object has a structure that will
match the pattern. The pattern may be an arbitrarily complex structure
including match variables. If a match succeeds then any values associated
with the match variables will be available for the execution of the body
of the method. Note there is no function name since we never call the
method explicitly. This demon technique is called "pattern-directed
invocation" and was brought into favor in the work of Carl Hewitt.
Constraints
One obvious application of the IF-x demons is to help maintain the
consistency of the data base; operations that modify the base may have
secondary effects and we can envision collections of these demons
scurrying about to carry out these operations while watching for
inconsistencies. This data base activity is a special case of a general
view of computing. Let us assume that we know what it means for the base
to be consistent (books should balance, number of employees should be the
same as the sum of employees assigned to each department, etc.) and assume
that the base is consistent at the current instant. If we add an employee,
decrease the budget, or remove a department then that activity will have
to affect many other components in the base if we expect to maintain
consistency. This means that there are "constraints" specified between
objects in the base and consistency will require that if we perturb
objects in the base then the perturbation must ripple to any constrained
cohorts. In general that ripple effect can propagate arbitrarily before
a steady-state is reached or before an inconsistency is announced. The
effect is that of programming with a set of equations, where the programming
problem is to find a simultaneous solution for that set.
In the simple case the perturbation is successful and we have a new
consistent base. You should see
a similarity to Visicalc's behavior in this: a very simple data base
with a simple set of constraints, and the consistent steady-state is
viewed graphically. It is an elegant, simple application of constraint-based
programming. To hark back to a previous topic, constraint-based
programming is no stranger to the school of logic programming; Prolog-like
languages view their data base of assertions and recursion equations as a
simultaneously satisfiable set and one may use equations like y=x↑2 either
to compute x, given y, or compute y, given x. This is called the
"invertibility" of logic programs.
In the difficult case --inconsistency-- we should expect to query the
system and isolate the inconsistency. This brings up an interesting
viewpoint: "Systems must be responsible for their conclusions and able to
explain how the conclusions were derived." [15] How novel; an accountable
system! An important consideration as systems become more complex and more
pervasive. One may view the trail of justifications for actions taken by
the system as deductions in a formal system, and we've come full circle.
Not quite, because the logical properties of such systems are non-standard
in the sense that theorems may become non-theorems in an
alice-in-wonderland world called "non-monotonic logic". However, systems
--Truth Maintenance Systems-- are being studied and being built.
In this brief paper we have had only a taste of the rich culture
surrounding LISP. There will be an intensive course given in the Western
Computer Science Institute, held this summer at Santa Clara University.
Other courses are planned.
Topics to be covered in those courses (not covered here or in the
tutorial) include:
Implementation: architecture of a LISP machine and storage management.
Theory: provability of programs, functional programming.
Programming environments: Interactive programming and its methodology.
LISP extensions: Scheme, Frame languages, Data/procedure-base maintenance.
Applications: Artificial Intelligence, Symbol manipulation systems.
In summary, we've only begun.
.comment implementation issues
. see sched
. address space
. big!
.
. typing of data, not identifiers
. strong type vs. type free
. tags
.
. bibop (tlc-lisp)
. why
. why not
.
.. gc
. why: cf ref count
. how: mark-sweep
. compact
. real-tiime
. multi-process
.
. shallow/deep binding
. trade-offs
.. baker(?)
.
.;
Bibliography
1. Allen, J., Anatomy of LISP, McGraw-Hill, 1978.
2. Allen, J., "The Bankruptcy of BASIC", submitted for publication.
3. Backus, J., "Can Programming Be Liberated from the von Neumann
Style?", CACM, August 1978.
4. Byte Magazine, Special LISP Issue, August 1979.
5. Charniak, E., et al., Artificial Intelligence Programming, Lawrence
Erlbaum, 1980.
*. Clarke, A. C., 2001: A Space Odyssey.
6. Conference Record of the 1980 LISP Conference.
.begin indent 2,2,2
Goldstein, I., and Bobrow, D., "Extending Object-Oriented Programming in
Smalltalk"
Steele, G., and Sussman, G., "The Dream of a Lifetime: A Lazy Variable
Extent Mechanism"
.end
7. Creative Computing, Special Issues on Actor Languages, October-November
1980.
8. Culler, G., "Function-Oriented On-line Analysis", 1962 Workshop on
Computer Organization.
9. Henderson, P., Functional Programming, Prentice Hall, 1980.
10. Hofstadter, D., Godel, Escher, Bach: An Eternal Golden Braid, Vintage
Books, 1980.
*. Ingalls, D., "The Smalltalk-76 Programming System: Design and
Implementation", POPL 1978, pp. 9-15.
11. Iverson, K., "Notation as a Tool of Thought", CACM, August 1980.
12. Kowalski, R., Logic for Problem Solving, North Holland, 1979.
13. Pirsig, R., Zen and the Art of Motorcycle Maintenance, Bantam Books,
1972.
*. Smith, H., Kamongo.
14. Spengler, O., The Decline of the West, Knopf, 1962.
15. Steele, G., and Sussman, G., "Constraints", MIT AI Memo #502, November
1978.
16. Winston, P., LISP, Addison-Wesley, 1980.